What’s inside


Python is a widely used programming language due to its simplicity and ease of
development. It presents an exciting prospect to software developers. From the reverser’s
point of view, reversing such an application presents different challenges, which this
document is going to explore.


Reversing a frozen Python executable


Abstract 


Python is a general-purpose, open source programming language. Among other things, Python sports a remarkably simple, readable
and maintainable syntax that simplifies software development. Its main asset is that it can be used to program complex systems without
a necessarily complex codebase. It is this feature that places Python among the top five programming languages used in the world today. Python is
deployed in a variety of products. Its current userbase includes Google, YouTube, Industrial Light & Magic, NASA, BitTorrent, Skype,
Dropbox etc.


There is a growing tendency among software developers to write their software in Python, for the reasons described above. However,
Python is an interpreted scripting language, unlike compiled languages such as C or C++. This means that the code is interpreted every time it is run,
which presents a major obstacle in software deployment. If it is necessary to ship the source code with every distribution, then closed-source software
cannot exist. Everything is as good as open source, free for everyone to modify and use.


In this context, there has already been some work to protect software developed in Python. Most of these solutions take the Python
source code, compile it to a .pyc or .pyo file, and then embed these compiled files into a native executable for the target platform. The Python
runtime, along with the necessary libraries, is also embedded inside the executable. Whenever the executable is run, a stub (embedded inside) starts the
Python runtime, which in turn begins executing the main program.


For us, the reversers, reverse engineering such an application is different from reversing a compiled application written in C/C++, Delphi, Visual
Basic or even assembly language. Since Python is an interpreted language, the code runs in a virtual machine. If we do not have access to the source
code, it is very difficult to reverse such an application at the VM level. So the primary step in such a reversing endeavour is reconstructing
the source code from the executable. Once this step is over, the rest becomes a trivial task.


Our Toolset...


Our target, AthTek DigiBand, from http://www.athtek.com/digiband.html

Python versions 2.6 and 2.7 from http://www.python.org/download/

PE analyzers such as PEiD, ExeinfoPE, Detect It Easy and RDG Packer Detector from http://tuts4you.com/download.php?list.37

OllyDbg from http://www.ollydbg.de/version2.html

PyInstaller from http://www.pyinstaller.org/

uncompyle2 from https://github.com/Mysterie/uncompyle2

Any hex editor, such as HxD from http://mh-nexus.de/en/hxd/


AthTek DigiBand is the target we will study.


Python is required because the reversing session requires us to run scripts, which will run only if Python is installed. Two different versions are required, as code
intended for one version may not run correctly on the other.


PE analyzers, as usual, are plentiful. This document depicts the use of some, but feel free to use another.

OllyDbg is the de facto tool for reversing on the Win32 platform.

The use of PyInstaller and uncompyle2 will become clear as we progress. For now we can download a copy of these two tools.


Lastly, any hex editor will do. Here the screenshots are from the popular free hex editor HxD.


Further, although prior knowledge of programming in Python is not necessary, possessing it would definitely help.


Target


The target of our research is AthTek DigiBand, a piece of software
which can automatically generate music. This is what the website says about
the software:


“AthTek DigiBand is a piece of automatic music composition software for Windows. It
can automatically compose music with flexible instruments and melodies. It can also
improvise an accompaniment to imported audio files, live computer keyboard playing
or humming. Rich instruments and music styles are integrated to this brilliant
music software. With AthTek DigiBand, you will enjoy the fun of having a versatile
music group on the computer.


For people who are not musical, the process of composing and playing music seems
almost magical. AthTek DigiBand is a program that aims to bring the joys of musical
creation to the tone-deaf masses. With this easy-to-use music software, you can
just follow the instructions to compose music or improvise accompaniments. You can
even convert your created music into midi notation in AthTek DigiBand.”


Restrictions


[Table: functions restricted in the trial version]
Number of instrument tracks
Number of preset structure types
Number of genres
Open project
Update band data
Export midi
Export wave
Customize structure


[Screenshot: the DigiBand main window]


The latest version available at the time of writing is
version 1.7. We will be reversing this application to study the
method for dealing with such applications. We will extract the
source code from this application, but we will NOT bypass the
registration, for evident reasons. In fact, by following this tutorial
we can even recompile the application.

The trial version has many limitations, as shown in the graphic.
It is in fact possible to remove these limitations but, as said
before, that is not the purpose of this document. Moreover,
doing so would harm the developers.


The first step in reversing, as usual, is a detailed study of the target. So let’s
load the binary into some of the analyzers and see the results.


[Screenshots: PE analyzer results for C:\Program Files\AthTek\DigiBand 1.7\DigiBand.exe]

Entry point: 00006C6B (EP section: .text)
File offset: 00006C6B
Linker info: 7.10
Subsystem: Windows GUI
File size: 012A5E51h (19553873 bytes), overlay: 0127EE51
Number of sections: 4
Compiler: Microsoft Visual C++ 7.0 / 7.1
Packer: nothing detected ("Not packed")
Entropy: 7.99721 bits/byte ("The file is compressed or encrypted")


Analyzing the results 


The results we have so far are not encouraging. All the tools used for
analysis reported Microsoft Visual C++ 7.0 as the compiler. No packer or
protector was detected. However, entropy scanning reports that the file is packed
or encrypted. So either the file is protected by some advanced protector
which the world has not yet heard of (like a private protector specifically
developed by the company, which is improbable) or the protection is so simple that it
evades common methods of detection.
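
The entropy figure reported by the analyzers can be reproduced with a few lines of code. The following is only a minimal sketch (written for Python 3, not the Python 2.6/2.7 used elsewhere in this document) of the usual Shannon entropy calculation in bits per byte; the function name shannon_entropy is our own, and the analyzers may compute their figure slightly differently.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    plain = b"A" * 4096                # a single symbol: lowest possible entropy
    uniform = bytes(range(256)) * 16   # all byte values equally frequent: 8 bits/byte
    random_like = os.urandom(1 << 16)  # stands in for compressed or encrypted data
    for name, blob in (("plain", plain), ("uniform", uniform), ("random", random_like)):
        print(name, round(shannon_entropy(blob), 5))
```

A value close to 8 bits per byte, like the 7.99721 seen above, means the bytes are almost uniformly distributed, which is typical of compressed or encrypted data and not of ordinary code.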

In such a confusing situation a useful tool comes to our rescue. It is
the Krypto Analyzer plugin, aka KANAL, for PEiD. So let’s load the file in the
tool and see what it has to say.


[Screenshot: KANAL v2.90 results]

ADLER32 :: 000052E3 :: 004052E3
ADLER32 :: 000053AD :: 004053AD
CRC32 :: 00010AE8 :: 00410AE8
ZLIB deflate [word] :: 000109E8 :: 004109E8

Detected 4 crypto signatures (in 3.5s)


KANAL reports 4 crypto signatures. Among them, the one that is
specifically interesting is the ZLIB signature. zlib is a compression library
implementing the deflate algorithm; it is commonly used to compress data
blocks of arbitrary length and is efficient at doing so.

So it looks like we have found our packer. The file in all probability
contains data compressed with zlib.
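
To check this suspicion without any special tool, one can scan a file for byte sequences that look like zlib stream headers and try to inflate them. The sketch below (Python 3) is our own illustration, not what KANAL does: it only looks for streams that use the common 32 KiB window (first byte 0x78), and the helper name find_zlib_streams and the min_size threshold are assumptions made for the example.

```python
import zlib


def find_zlib_streams(blob: bytes, min_size: int = 16):
    """Yield (offset, decompressed_bytes) for each complete zlib stream in blob."""
    pos = 0
    while True:
        pos = blob.find(b"\x78", pos)  # CMF byte 0x78: deflate, 32 KiB window
        if pos < 0 or pos + 2 > len(blob):
            return
        # A valid 2-byte zlib header, read big-endian, is a multiple of 31.
        if ((blob[pos] << 8) | blob[pos + 1]) % 31 == 0:
            decomp = zlib.decompressobj()
            try:
                out = decomp.decompress(blob[pos:])
            except zlib.error:
                out = b""
            # eof is only set when the stream ended cleanly (checksum included).
            if decomp.eof and len(out) >= min_size:
                yield pos, out
                # Jump over the stream we just decoded.
                pos += len(blob) - pos - len(decomp.unused_data)
                continue
        pos += 1


if __name__ == "__main__":
    demo = b"MZ" + bytes(64) + zlib.compress(b"hidden python code " * 10)
    for offset, data in find_zlib_streams(demo):
        print(hex(offset), len(data))
```

Re-slicing the buffer at every candidate is slow on a file of this size; it keeps the sketch short, and a real scan would limit itself to the overlay.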


Debugging the application in OllyDbg reveals other useful
information. A string search reveals the presence of the string
python, implying that the application was developed in Python. But the file
we are running is not a py file, it is an exe file. So the Python script was
actually converted to the exe file which we ran.
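
The same string search can be done outside the debugger. The following is a small sketch (Python 3) that pulls printable ASCII runs out of a binary blob and keeps those mentioning a keyword; the function name find_strings and the sample bytes in the demo are made up for illustration.

```python
import re


def find_strings(blob: bytes, needle: bytes = b"python", min_len: int = 4):
    """Return printable ASCII runs in blob that contain needle (case-insensitive)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    needle = needle.lower()
    return [m.group().decode("ascii")
            for m in pattern.finditer(blob)
            if needle in m.group().lower()]


if __name__ == "__main__":
    # Made-up sample bytes; on a real target, read the exe with open(path, "rb").
    sample = b"\x00\x01python26.dll\x00\xff\xfeabc\x00Py_Initialize\x00"
    print(find_strings(sample))
```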


Another feature is that the parent process starts a child process,
which in turn loads the main GUI. The parent process meanwhile
holds a mutex, waiting for the child to terminate, and once the child
terminates, the parent also terminates. This characteristic can be easily
detected if we use the latest version of OllyDbg (v2.01) and turn on
Debug Child Process in Options.




With these features in mind, let us search for popular py-to-exe
tools on the Internet. The results found are py2exe, PyInstaller,
cx_Freeze, bbfreeze etc.


It is now time to read the documentation of these tools, to find
out which of them has a set of features similar to what we observed.
This is a monotonous task which all reversers try to avoid, but in our
case we have no other option. So let’s do the unavoidable.


Finding the python to exe tool


At this point (after reading the necessary documentation) we can zero in on PyInstaller as the tool used to generate the exe.
The relevant documentation (which strongly supports our assumption), copied from the PyInstaller manual, is presented below:


One Pass Execution


In a single directory deployment (--onedir, which is the default), all of the binaries are already in the file system. In that case, the embedding app:

* opens the archive
* starts Python (on Windows, this is done with dynamic loading so one embedding app binary can be used with any Python version)
* imports all the modules which are at the top level of the archive (basically, bootstraps the import hooks)
* mounts the ZlibArchive(s) in the outer archive
* runs all the scripts which are at the top level of the archive
* finalizes Python


Two Pass Execution


There are a couple situations which require two passes:

* a --onefile deployment (on Windows, the files can't be cleaned up afterwards because Python does not call FreeLibrary; on other platforms [...] extracted in the same process that uses them)
* LD_LIBRARY_PATH needs to be set to find the binaries (not extension modules, but modules the extensions are linked to).


The first pass:

* opens the archive
* extracts all the binaries in the archive (in PyInstaller 2.0, this is always to a temporary directory).
* sets a magic environment variable
* sets LD_LIBRARY_PATH (non-Windows)
* executes itself as a child process (letting the child use his stdin, stdout and stderr)
* waits for the child to exit (on *nix, the child actually replaces the parent)
* cleans up the extracted binaries (so on *nix, this is done by the child)


Unpacking the executable


According to the documentation, the executable
contains an embedded ZLIB-compressed archive. This archive
contains the Python code which we are after. Further, it is mentioned that
execution is done in two steps: the first step decompresses the archive
to a temporary directory, and in the second this decompressed code
is executed by a child process. When the child terminates, the parent
cleans up the extracted files in the temporary directory. This description
fits nicely with the behaviour we observed previously. However, one
thing we are still not sure of is whether the decompressed ZLIB archive
contains the actual Python code in human-readable form.
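
The two-step behaviour described above can be imitated in a few lines. What follows is only a rough sketch in Python 3 of the idea (extract, set a marker variable, run a child, wait, clean up), not PyInstaller's actual bootloader, which is a native stub that re-executes itself. The names MARKER_ENV, first_pass and CHILD, and the variable name DEMO_BUNDLE_DIR, are hypothetical.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Hypothetical marker name; the real bootloader uses its own variable.
MARKER_ENV = "DEMO_BUNDLE_DIR"


def first_pass(files: dict, child_source: str):
    """Extract files, run a child interpreter on child_source, then clean up.

    Returns (exit_code, child_stdout, extraction_dir).
    """
    workdir = tempfile.mkdtemp(prefix="_demo_")
    try:
        for name, data in files.items():
            with open(os.path.join(workdir, name), "wb") as fh:
                fh.write(data)
        # The marker tells the child where the extracted files live.
        env = dict(os.environ)
        env[MARKER_ENV] = workdir
        # subprocess.run blocks until the child exits, like the waiting parent.
        child = subprocess.run([sys.executable, "-c", child_source],
                               env=env, capture_output=True, text=True)
        return child.returncode, child.stdout, workdir
    finally:
        # Cleanup happens only after the child has terminated.
        shutil.rmtree(workdir, ignore_errors=True)


# Second pass: the child finds the extracted file through the marker variable.
CHILD = (
    "import os\n"
    "path = os.path.join(os.environ['DEMO_BUNDLE_DIR'], 'main.txt')\n"
    "print(open(path).read())\n"
)

if __name__ == "__main__":
    print(first_pass({"main.txt": b"payload"}, CHILD))
```

For the reverser, the point is that the extracted files exist on disk only while the child is alive, so they have to be copied out of the temporary directory before the program is closed.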

So we have to find a way to grab the ZLIB archive and decompress
it. We are in luck, as PyInstaller itself ships with a tool (actually a
Python script) which can be used to inspect the contents of an executable
produced by it. The script is ArchiveViewer.py and can be found
in the utils directory within the PyInstaller distribution. So let’s run the
script on the executable and await the results.


[Screenshot: ArchiveViewer usage text from the manual, cut off at the right edge]

ArchiveViewer lets you examine the contents of any archive build with PyInstalle [...]
executable (PYZ, PKG or exe). Invoke it with the target as the first arg (It has b [...]
as a Send-To so it shows on the context menu in Explorer). The archive can be [...]
using these commands:

O <nm>
    Open the embedded archive <nm> (will prompt if omitted).
U
    Go up one level (go back to viewing the embedding archive).
X <nm>
    Extract nm (will prompt if omitted). Prompts for output filename. If nor [...]
    extracted to stdout.


In the script output we can find references to files such as
python26.dll, PyQt.pyd etc. This implies that Python version 2.6 was used,
with Qt as the user interface framework.

Another thing worth investigating is the outPYZ1.pyz file. Looking
in the documentation reveals that the pyz file is a ZLIB-compressed
archive which contains the compiled version (i.e. a .pyc or .pyo) of the
Python source code (.py file).

Moreover, the provided script ArchiveViewer.py can be used to view
and extract such an archive. The usage of the script is shown below and is
also documented in the manual.
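
Once the position and length of an entry are known from the archive listing, pulling it out by hand is straightforward. The sketch below (Python 3) assumes that each entry is a zlib-compressed, marshalled code object, which agrees with what the documentation says about the pyz and with the 0x63 byte we meet later; the helper names extract_entry and looks_like_code_object are our own, and the demo archive is synthetic.

```python
import marshal
import zlib


def extract_entry(archive: bytes, pos: int, length: int) -> bytes:
    """Return the decompressed bytes of the entry stored at archive[pos:pos+length]."""
    return zlib.decompress(archive[pos:pos + length])


def looks_like_code_object(data: bytes) -> bool:
    """True if data starts with marshal's type byte for a code object ('c', 0x63).

    Python 3.4+ may set the high bit as a reference flag, so it is masked off.
    """
    return bool(data) and (data[0] & 0x7F) == 0x63


if __name__ == "__main__":
    # Synthetic archive: 32 filler bytes, one compressed entry, then a trailer.
    code = compile("x = 6 * 7", "<demo>", "exec")
    packed = zlib.compress(marshal.dumps(code))
    archive = bytes(32) + packed + b"TAIL"
    raw = extract_entry(archive, 32, len(packed))
    print(looks_like_code_object(raw))
```

Note that marshal data is specific to the interpreter version that wrote it, so an entry taken from a Python 2.6 program should be loaded with Python 2.6, not with the Python 3 used in this sketch.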


[Screenshot: console window running python utils\ArchiveViewer.py on the executable]


Peering inside the PYZ


Extracting the pyz file and then running ArchiveViewer.py on it
results in:

[Screenshot: console window listing the contents of the pyz archive]

Now things look promising; it seems the files inside the
pyz contain the program code. However, they should not be human-readable
since, as per the docs, the pyz file contains compiled Python
code (i.e. .pyc or .pyo). We have to find a way to decompile these
files back to the original Python source code (.py file).
So let’s extract one such file from the archive, such as the
register file present at an offset of 1949910L inside the archive, and
see what’s inside.

We load the register file in a hex editor to see its contents.

[Screenshot: the register file opened in a hex editor]

The file starts with 0x63. Now, this file must be a compiled
Python file. However, a compiled Python (.pyc) file starts with a
different magic header. If we open a .pyc file from Python version 2.6,
we will see that the header has the following format.

struct header
{
    DWORD magic;
    DWORD timestamp;
};

For version 2.6 the magic is hardcoded to be D1 F2 0D 0A.
The timestamp is the modification time of the py source at the moment
the pyc was created. Python uses this value to check whether the pyc is
out of date with respect to the corresponding py source and recompiles
accordingly.
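
The header layout just described, and the repair it calls for when the header is missing, can be sketched as follows (Python 3). The function names parse_header and add_header are our own; the magic bytes listed for 2.5 and 2.7 are included only to show why a pyc is tied to one interpreter version.

```python
import struct

# Magic values as they appear on disk: a little-endian number followed by CR LF.
KNOWN_MAGICS = {
    b"\xb3\xf2\r\n": "2.5",
    b"\xd1\xf2\r\n": "2.6",
    b"\x03\xf3\r\n": "2.7",
}


def parse_header(pyc: bytes):
    """Split a Python 2 style .pyc into (version_or_None, timestamp, body)."""
    if len(pyc) < 8:
        raise ValueError("too short to hold an 8-byte header")
    magic = pyc[:4]
    (timestamp,) = struct.unpack("<I", pyc[4:8])  # 32-bit little-endian
    return KNOWN_MAGICS.get(magic), timestamp, pyc[8:]


def add_header(body: bytes, magic: bytes = b"\xd1\xf2\r\n", timestamp: int = 0) -> bytes:
    """Prepend magic and timestamp to a bare marshalled code object."""
    return magic + struct.pack("<I", timestamp) + body


if __name__ == "__main__":
    bare = b"c" + bytes(12)  # stands in for the extracted 'register' file
    fixed = add_header(bare)
    print(fixed[:8].hex(), parse_header(fixed)[:2])
```

With the default arguments, add_header produces exactly the eight bytes D1 F2 0D 0A 00 00 00 00 in front of the data, i.e. the 2.6 magic followed by a zero timestamp.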
Decompiling the PYC


So in order to make Python recognize the pyc file, we have to
insert the header bytes into the file before we can proceed. Load
the file in a hex editor and insert the following bytes (D1 F2 0D
0A 00 00 00 00) at the beginning of the file. Here the first 4 bytes,
as already said, are the hardcoded magic value, and the last 4 are the
timestamp, which can be anything. After insertion we save and rename
the file as a .pyc file.


Now if we run the file, Python will recognize it, but
ONLY Python version 2.6 will. Any other version will
report an invalid magic value in the header.


Now the final step in this reversing endeavour is decompiling
this pyc file back to its py source. Again we resort to the power
of the World Wide Web. Searching for Python 2.6 decompilers, I came
across quite a few of them. I tested each but was unsuccessful with the
majority of them. So in this document, I am only mentioning the
one which I was successful with: the uncompyle2 decompiler.
It can decompile Python versions 2.5, 2.6 and 2.7. However, it will run
ONLY on version 2.7.


The usage of the decompiler is well documented in the readme
file that comes with it. So let’s install it and run it on the pyc.

[Screenshot: console window showing the decompiler run]

Now, let’s open the produced py file with our fingers crossed.
The script says that everything was successful. The results are shown
below.


And the final results


# Embedded file name: F:\project\youband ver1.6\build\pyi-win32\digi\outPYZ1.pyz/registe
"""
Module implementing RegisterDialog.
"""
from PyQt4.QtGui import QDialog, QMessageBox, QColor, QPixmap, QGridLayout, QTextEdit
from PyQt4.QtCore import pyqtSignature, QObject, SIGNAL, Qt, QSize, QRect, QCoreApplication
from Ui_RegisterGUI import Ui_Dialog
from Ui_VerdiffGUI import Ui_VerdiffDialog
import TEXTS
import wmi
import time
import base64
import traceback
import urllib, os, socket
SUPERKEY = ('SD130901GOTDZ4W3P1D', 'SDI3TESTGOTDZ4W3P1D')


class register(QObject):

    def __init__(self, parent):
        super(register, self).__init__(parent)
        self.father = parent
        if TEXTS.SOFT_VERSION == TEXTS.TRAIL_VER:
            self.machine_code_list = ['112233445566']
        else:
            self.machine_code_list = self.getMachineCodeList()
        if TEXTS.LANGUAGE == TEXTS.CN_VER:
            self.forbidden_web = "htt inniaoniao.com/register/youband/check_forbid.php?code=%s" %
        elif TEXTS.LANGUAGE == TEXTS.JP_VER:
            self.forbidden_web = "htt inniaoniao.c 1p/register/youband/check_forbid.php?code=%s


And it’s a SUCCESS! The tool has indeed done a great job, and in addition it has also recovered the documentation strings, which makes the
reverser’s job easier. We have succeeded in extracting the original source code from the executable. We can even recompile the application from the generated
sources but, as already said, that is not the purpose of this document. With this, we come to the end of this mini tutorial. I hope you use your newly gained
knowledge in a positive way.


This is extremecoders signing off, Ciao!